Context-based Arabic Morphological Analysis for Machine Translation
Abstract
In this paper, we present a novel morphology preprocessing technique for Arabic-English translation. We exploit the alignment between Arabic morphemes and English words to learn a model that removes nonaligned Arabic morphemes. The model is an instance of the Conditional Random Field (Lafferty et al., 2001) model; it deletes a morpheme based on the morpheme's context. We achieved an improvement of around two BLEU points over the original Arabic translation for both a travel-domain system trained on 20K sentence pairs and a news-domain system trained on 177K sentence pairs, and showed a potential improvement for a large-scale SMT system trained on 5 million sentence pairs.
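As a rough illustration of the idea in the abstract, the sketch below derives keep/delete labels for Arabic morphemes from an alignment and builds simple context features for each morpheme. It is only a sketch under stated assumptions: the function names, the feature set, the one-morpheme context window, and the toy transliterated example are hypothetical and are not taken from the paper. The resulting (features, label) pairs are the kind of input a CRF toolkit could be trained on.

```python
from typing import Dict, List, Set


def deletion_labels(morphemes: List[str], aligned: Set[int]) -> List[str]:
    """Label a morpheme KEEP if its index is aligned to some English word, else DELETE."""
    return ["KEEP" if i in aligned else "DELETE" for i in range(len(morphemes))]


def context_features(morphemes: List[str], i: int, window: int = 1) -> Dict[str, str]:
    """Describe morpheme i by itself and its neighbours (hypothetical feature set)."""
    feats = {"cur": morphemes[i]}
    for off in range(1, window + 1):
        # Pad with boundary symbols when the window runs past the sentence edges.
        feats[f"prev{off}"] = morphemes[i - off] if i - off >= 0 else "<s>"
        feats[f"next{off}"] = morphemes[i + off] if i + off < len(morphemes) else "</s>"
    return feats


# Toy, made-up example: a segmented word in Buckwalter-style transliteration,
# with an assumed alignment in which the middle morpheme has no English counterpart.
sentence = ["w+", "Al+", "ktAb"]
aligned_indices = {0, 2}

labels = deletion_labels(sentence, aligned_indices)
features = [context_features(sentence, i) for i in range(len(sentence))]
```

The labels come from the alignment at training time, while at translation time only the context features are available, which is why the decision has to be learned from context rather than looked up.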
Similar Resources
Arabic-to-English Example Based Machine Translation Using Context-Insensitive Morphological Analysis
We describe and discuss the results of ongoing experiments that use morphological analysis in the context of Example-Based Machine Translation. The goal is to increase the coverage of our training examples so as to capture things that are not directly seen in the training text. This is done through a two-stage process of generalization and filtering.
Developing a New Approach for Arabic Morphological Analysis and Generation
Arabic morphological analysis is one of the essential stages in Arabic Natural Language Processing. In this paper we present an approach for Arabic morphological analysis. This approach is based on Arabic morphological automaton (AMAUT). The proposed technique uses a morphological database realized using XMODEL language. Arabic morphology represents a special type of morphological systems becau...
Arabic Roots Extraction Using Morphological Analysis
The Arabic language is characterized by its rich and complex morphology based on root-pattern schemes. Root extraction is one of the most important topics in the context of natural language processing applications such as information retrieval, text processing, machine translation, speech tagging, etc. This paper presents a method to extract the trilateral roots of Arabic words, acting from the...
Developing a New System for Arabic Morphological Analysis and Generation
Arabic morphology poses special challenges to computational natural language processing systems. Its rich morphology and the highly complex word formation process of roots and patterns make computational approaches to Arabic very challenging. In this paper we present an approach for morphological analysis and generation of Modern Standard Arabic (MSA). Our approach is based on Arabic morphologi...
Morphological Analysis and Generation for Machine Translation from and to Arabic
In this paper, we present the importance of machine translation and the need for linguistic treatment in the transfer-based approach. We then present our method for analysis and generation, which is based on linguistic features of the Arabic word and uses the scheme concept to extract morphological information. This information is very useful in tree generation and structural transfer.